Word Length in Estonian Prose

Author

  • Peter Grzybek
Abstract

The present study deals with the problem of word length in Estonian prose. As is well known from quantitative and synergetic linguistics, word length is not an isolated phenomenon; rather, it is closely interrelated with word frequency, sentence length, syllable length, and other properties, which together make language a dynamically balanced system. Moreover, the frequency with which words of a given length occur is not haphazard or chaotic, but organized regularly, in a law-like manner. In this respect, the necessarily interdisciplinary approach to this issue may not only be helpful for analogous studies in other fields; it may also help to bridge the gap between what is usually juxtaposed as ‘soft’ vs. ‘hard’ or ‘human’ vs. ‘natural’ sciences, and the like. The results obviously depend on a number of factors (e.g., the definition of ‘word’ itself and of its constituent elements; the choice of a paradigmatic vs. syntagmatic approach, i.e. of dictionary vs. text material; the study of lemmas vs. word forms; etc.), so the relevant theoretical linguistic aspects are discussed first. The linguistic material is then presented: in total, five novels by modern Estonian authors (Pärtel Ekman, Jaan Kross, Reet Kudu, Viivi Luik) are analysed chapter by chapter, amounting to ca. a quarter of a million words, or ca. 20,000 sentences.
As a result, the (discrete) Zipf-Alekseev distribution turns out to be an excellent model for word length frequencies of Estonian prose texts, which paves the way for future studies from various perspectives. Generally speaking, the result allows for a qualitative interpretation in terms of a diversification process. More concretely, it provides a solid basis not only for further intra-lingual studies of Estonian (including factors such as different discourse types, author-specific styles, periods of language development, etc.), but also for systematic comparative inter-lingual studies (including language specifics, parameter interpretation, etc.).
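
As a rough illustration of the kind of model named above, the following Python sketch computes probabilities for a right-truncated Zipf-Alekseev distribution in one commonly cited form, P(x) = x^(-(a + b ln x)) / T for x = 1, ..., n, where T normalizes the sum to 1. The abstract does not state which variant of the distribution the study fits, and the function name and the parameter values used here are hypothetical, chosen only for demonstration.

```python
import math


def zipf_alekseev_pmf(a, b, n):
    """Right-truncated Zipf-Alekseev probabilities for word lengths 1..n.

    Assumed form (not confirmed by the abstract):
        P(x) = x ** -(a + b * ln x) / T,  x = 1, ..., n
    where T is the sum of the unnormalized weights.
    """
    weights = [x ** -(a + b * math.log(x)) for x in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameters, purely for illustration; they are not
# estimates from the Estonian novels discussed in the abstract.
probs = zipf_alekseev_pmf(a=0.5, b=0.8, n=8)
for length, p in enumerate(probs, start=1):
    print(f"length {length}: {p:.4f}")
```

In practice, a and b would be estimated from observed word-length counts (for instance, lengths measured in syllables per word), and the fitted probabilities compared with the observed relative frequencies.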

Similar resources

Found in Translation: The Reception of Andrei Ivanov’s Prose in Estonia

Andrei Ivanov (b. 1971) is the best-known Estonian Russian-language writer and has won many literary awards in Estonia and Russia. His prose and position in the literary field of Estonia have initiated a discussion about the exact definition of Estonian literature and the status of Estonian Russian-language literature. Due to Ivanov’s prose, the world of Estonian Russians has become mo...

Experiments on predictability of word in context and information rate in natural language

Based on data from a large-scale experiment with human subjects, we conclude that the logarithm of probability to guess a word in context (unpredictability) depends linearly on the word length. This result holds both for poetry and prose, even though with prose, the subjects don’t know the length of the omitted word. We hypothesize that this effect reflects a tendency of natural language to hav...

Study of the effect of Estonian and aqueous extract of Persian walnut tree leaf (Juglans regia) on growth indicators in western white shrimp farmed (Litopenaeus vannamei)

The aim of this study was to investigate the effects of Estonian and aqueous extracts of Persian walnut leaves on the performance of growth indices in western white shrimp (Litopenaeus vannamei). Materials and methods included 6 treatments of shrimp with different concentrations of 100, 200 and 300 mg/kg aqueous and Estonian extracts of Persian walnut leaves in the diet and 2 negative control t...

Brain potentials elicited by words: word length and frequency predict the latency of an early negativity.

Prior work has suggested that open- and closed-class words elicit negative components in the event-related potential (ERP) that differ in timing and scalp distribution. We tested this hypothesis against the possibility that the word-class effects are attributable to quantitative differences in word length and frequency. Event-related brain potentials (ERPs) were recorded from 13 scalp sites whi...

Isometric Lineation in English Texts: An Empirical and Mathematical Examination of its Character and Consequences

In this paper we build on earlier observations and theory regarding word length frequency and sequential distribution to develop a mathematical characterization of some of the language features distinguishing isometrically lineated text from unlineated text, in other words the features distinguishing isometrical verse from prose. It is shown that the frequency of Qn of n syllables making comple...

Publication date: 2016